22 research outputs found

    Classification of opinionated texts by analogy

    Get PDF
    With the disproportionate growth of the World Wide Web and of the quantity and availability of information services, we face an excessive accumulation of documents of various kinds. Despite the positive aspects this represents and the potential it carries, a new problem arises: we need capable tools and methodologies to classify a document as to its quality. Assessing the quality of a Web page is not easy. Many works have addressed the technical evaluation of the structure of Web pages; this thesis follows a different course. It seeks to evaluate the content of pages according to the opinions and feelings they express. The basic criterion adopted to assess the quality of Web pages is the absence of opinions and feelings in their texts. When we consult information from the Web, how do we know that the information is reliable and does not merely convey the opinions or feelings of whoever made it available to the public? How can we be sure, when reading a text, that we are not being misled by an author who is expressing his opinion or, once again, his feelings? How can we guarantee that our own assessment is free from any value judgment we might hold? These questions make the area known as "Opinion Mining", "Opinion Retrieval", or "Sentiment Analysis" worth investigating, as we believe there is still much to discover. After extensive research and reading, we concluded that we did not want to follow the methodology proposed so far by other researchers, who basically work with manually annotated objective and subjective corpora. We consider this a disadvantage because such corpora are limited: they are small and cover a restricted number of subjects. We disagree on another point as well: some researchers use only one or a few morphological classes, or specific words, as predefined attributes. Since we want to identify the degree of objectivity/subjectivity of sentences, not documents, the more attributes we have, the more accurate we expect our classification to be. We also want our method to be as automatic, or at least as weakly supervised, as possible. Having identified these gaps in the area, we defined our line of intervention for this dissertation. As already mentioned, the corpora used in opinion research are, as a rule, manually annotated and not very inclusive. To tackle this problem we propose to replace them with texts taken from Wikipedia and texts extracted from Weblogs, accessible to any researcher in the area. Wikipedia is taken to represent objective texts and Weblogs to represent subjective texts (which can be regarded as a repository of opinions). These new corpora bring great advantages: they are obtained automatically, they are not manually annotated, they can be built at any time and for any language, and they are very inclusive. To be able to claim that Wikipedia may represent objective texts and Weblogs subjective texts, we assess their similarity, at various morphological levels, with manually annotated objective/subjective corpora. To evaluate this similarity, we use two different methodologies, the Rocchio Method and the Language Model, on a cross-validation basis. The two methodologies yield similar results, which confirms our hypothesis. Building on this result, we then automatically classify sentences (again at various morphological levels) by analogy. At this stage, we use several SVM classifiers, with training and test sets built over the various corpora on a cross-validation basis, so that, once again, we have several results to compare when drawing our final conclusions. This new concept of quality assessment of a Web page, through the absence of opinions, offers the scientific community another line of research in the area of opinions. Users also benefit: when consulting a Web page or using a search engine, they can know with some certainty whether the information is factual or merely a set of opinions/sentiments expressed by its authors, thus setting aside their own value judgments about what they see.
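    The centroid-based comparison described above can be pictured with off-the-shelf tools. Below is a minimal sketch, not the thesis code: a Rocchio-style nearest-centroid classifier over TF-IDF vectors, cross-validated on a Wikipedia-versus-Weblog split. The scikit-learn usage and the toy sentences are assumptions for illustration; a real experiment would substitute the automatically harvested corpora.

        # A minimal Rocchio-style sketch of the similarity check described above.
        # The texts below are toy stand-ins; the thesis builds its corpora
        # automatically from Wikipedia (assumed objective) and Weblogs
        # (assumed subjective).
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.neighbors import NearestCentroid
        from sklearn.model_selection import cross_val_score
        from sklearn.pipeline import make_pipeline

        wikipedia_texts = [  # toy stand-ins for harvested Wikipedia articles
            "The city was founded in 1153 and has a population of 90000.",
            "Water boils at 100 degrees Celsius at sea level.",
            "The museum holds paintings from the fifteenth century.",
        ]
        weblog_texts = [  # toy stand-ins for harvested Weblog posts
            "I honestly think this movie was a complete waste of time.",
            "We loved the food there, you should definitely go!",
            "In my opinion the new policy is a terrible idea.",
        ]
        texts = wikipedia_texts + weblog_texts
        labels = [0] * len(wikipedia_texts) + [1] * len(weblog_texts)

        # TF-IDF vectors are L2-normalised, so the nearest-centroid rule
        # behaves like a classic Rocchio classifier.
        model = make_pipeline(TfidfVectorizer(), NearestCentroid())
        print(cross_val_score(model, texts, labels, cv=3).mean())

    Swapping NearestCentroid for an SVM (e.g. sklearn.svm.LinearSVC) in the same pipeline gives a rough analogue of the sentence-classification stage under the same cross-validation setup.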

    DRIPPS: a Corpus with Discourse Relations in Perfect Participial Sentences

    Get PDF
    The main objective of this paper is to introduce a new language resource for some varieties of Portuguese - European, Brazilian, Mozambican, and Angolan - and for British English, called DRIPPS (Discourse Relations In Perfect Participial Sentences). The DRIPPS corpus currently comprises 993 adverbial perfect participial sentences annotated with Discourse Relations and with the following Discourse Relational Devices: connectors, ordering of the clauses, temporal relations, tenses, and aspectual types. Additionally, an application with a Graphical User Interface (GUI) has been developed not only to browse and manipulate the corpus but also to allow the activation of specific Discourse Relation constraints, thereby selecting specific cases from the data set that can be analyzed separately. Besides calculating simple counts and percentages, insightful statistical graphs can be generated and visualized on the fly from the combination of the user-selected constraints and the loaded corpora. The application is pre-loaded with Portuguese and English cases and allows users to import and load further cases from different languages/varieties.
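    To make the constraint mechanism concrete, here is a minimal sketch of the kind of query the DRIPPS GUI supports; the column names and the toy rows are assumptions for illustration, not the application's actual schema.

        # Hypothetical tabular view of DRIPPS cases: activate two constraints
        # and report the relation distribution, mirroring the counts and
        # percentages the application computes on the fly.
        import pandas as pd

        cases = pd.DataFrame([
            # variety, clause order, annotated discourse relation
            ("European Portuguese", "participial-first", "cause"),
            ("European Portuguese", "participial-first", "temporal"),
            ("Brazilian Portuguese", "main-first", "cause"),
            ("British English", "participial-first", "concession"),
        ], columns=["variety", "clause_order", "discourse_relation"])

        subset = cases[(cases["variety"] == "European Portuguese")
                       & (cases["clause_order"] == "participial-first")]
        print((subset["discourse_relation"]
               .value_counts(normalize=True) * 100).round(1))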

    A Cluster-Based Opposition Differential Evolution Algorithm Boosted by a Local Search for ECG Signal Classification

    Full text link
    Electrocardiogram (ECG) signals, which capture the heart's electrical activity, are used to diagnose and monitor cardiac problems. The accurate classification of ECG signals, particularly for distinguishing among various types of arrhythmias and myocardial infarctions, is crucial for the early detection and treatment of heart-related diseases. This paper proposes a novel approach to ECG signal classification based on an improved differential evolution (DE) algorithm. In the initial stages of our approach, a preprocessing step is followed by the extraction of several significant features from the ECG signals. These extracted features are then provided as inputs to an enhanced multi-layer perceptron (MLP). While MLPs are still widely used for ECG signal classification, gradient-based methods, the most common choice for training them, have significant disadvantages, such as the risk of getting stuck in local optima. This paper therefore employs an enhanced DE algorithm, one of the most effective population-based algorithms, for the training process. To this end, we improve DE with a clustering-based strategy, opposition-based learning, and a local search: the clustering-based strategy acts as a crossover operator, while the opposition operator improves the exploration of the DE algorithm. The weights and biases found by the improved DE algorithm are then fed into six gradient-based local search algorithms; in other words, the weights found by DE serve as an initialization point. We thus introduce six variants of the training process, one per local search algorithm. In an extensive set of experiments, we show that our proposed training algorithm provides better results than conventional training algorithms.
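    The opposition-based step can be illustrated compactly. The sketch below shows the standard opposition-based learning rule, a + b - x over the search bounds [a, b], with elitist pairing; the bounds, pairing choice, and loss function are assumptions for illustration, not the authors' exact implementation.

        # A minimal sketch of opposition-based learning inside DE: for each
        # candidate, generate its "opposite" point and keep the fitter of the
        # pair, which widens exploration of the search space.
        import numpy as np

        def opposition_step(population, lower, upper, fitness):
            """Keep the better of each candidate and its opposite a + b - x."""
            opposite = lower + upper - population             # element-wise opposites
            keep = fitness(population) <= fitness(opposite)   # lower loss is better
            return np.where(keep[:, None], population, opposite)

        # Toy usage: each row is a flattened MLP weight vector being evolved.
        rng = np.random.default_rng(0)
        pop = rng.uniform(-1.0, 1.0, size=(20, 50))
        loss = lambda P: np.sum(P**2, axis=1)  # stand-in for the MLP classification loss
        pop = opposition_step(pop, -1.0, 1.0, loss)

    In the paper's scheme, the best weights found after such DE iterations would then seed the gradient-based local search algorithms.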

    Microglial Sirtuin 2 shapes long-term potentiation in hippocampal slices

    Get PDF
    Microglial cells have emerged as crucial players in synaptic plasticity during development and adulthood, as well as in neurodegenerative and neuroinflammatory conditions. Here we found that decreased levels of the deacetylase Sirtuin 2 (Sirt2) in microglia affect hippocampal synaptic plasticity under inflammatory conditions. The results show that the magnitude of long-term potentiation (LTP) recorded from hippocampal slices of wild-type mice does not differ between slices exposed to lipopolysaccharide (LPS), a pro-inflammatory stimulus, and those exposed to BSA. However, LTP recorded from hippocampal slices of microglia-specific Sirt2-deficient (Sirt2-) mice was significantly impaired by LPS. Importantly, LTP values were restored by memantine, an antagonist of N-methyl-D-aspartate (NMDA) receptors. These results indicate that microglial Sirt2 prevents NMDA-mediated excitotoxicity in hippocampal slices in response to an inflammatory signal such as LPS. Overall, our data suggest a key protective role for microglial Sirt2 in the mnesic deficits associated with neuroinflammation. This study was supported by Santa Casa da Misericórdia de Lisboa (MB37-2017), the GAPIC Research Program of the University of Lisbon Medical School (n° 2014002 and n° 2015028), and the doctoral grants PD/BD/128091/2016, SFRH/BD/118238/2016, PD/BD/114337/2016, and PD/BD/114441/2016.

    As capitanias hereditárias no mapa de Luís Teixeira

    Get PDF
    Starting from the well-known map by Luís Teixeira, we analyze the evolution of the hereditary captaincies over their first 50 years. We found that this map, despite its historical and cartographic value, contains several mistakes and inconsistencies: there are errors concerning the boundary lines, the donatories, and historical events. From the cartographic point of view, like other maps of the period, it is quite accurate in latitude but much less so in longitude. As for its dating, established mainly by examining the names of the donatories, data from different times coexist; the best explanation is that the author gathered most of the data in 1574 and later updated some, but not all, of the information in 1586. To examine this question, we surveyed the evolution of these territories, whose records are scattered. This allowed us to rectify some received knowledge: for example, the fact that the captaincy of Bahia, in its first years, consisted only of the city of Salvador, and the fact that the categorization as village or city was based not on greater or lesser size or importance, but on whether the settlement was created by the donatory or by the crown. These mistakes can perhaps be explained by the character of the map, similar to some roteiros of the time, and they illustrate in a practical way a lesson already known in theory: a map, however beautiful and historically valuable, cannot be taken as an accurate portrayal of reality or as a photograph of its time; like any historical document, it should be read carefully and with a critical eye.

    Mesures de similarité distributionnelle asymétrique pour la détection de l’implication textuelle par généralité

    No full text
    Textual Entailment aims at capturing major semantic inference needs across applications in Natural Language Processing. Since 2005, in the Recognizing Textual Entailment (RTE) task, systems have been asked to judge automatically whether the meaning of a portion of text, the Text (T), entails the meaning of another text, the Hypothesis (H). In this thesis we focus on a particular case of entailment, entailment by generality. Since there are various types of implication, we introduce the paradigm of Textual Entailment by Generality, which can be defined as the entailment from a specific sentence towards a more general sentence; in this context, the Text T entails the Hypothesis H because H is more general than T. We propose unsupervised, language-independent methods for Recognizing Textual Entailment by Generality. To this end we present an Informative Asymmetric Measure called the Simplified Asymmetric InfoSimba, which we combine with different asymmetric association measures to recognize this specific case of Textual Entailment. This thesis thus introduces a new kind of implication, implication by generality, and consequently the new task of recognizing implications by generality, a new direction of research in Natural Language Processing.

    Asymmetric Distributional Similarity Measures to Recognize Textual Entailment by Generality

    No full text
    Textual Entailment aims at capturing major semantic inference needs across applications in Natural Language Processing. Since 2005, in the Recognizing Textual Entailment (RTE) task, systems have been asked to judge automatically whether the meaning of a portion of text, the Text (T), entails the meaning of another text, the Hypothesis (H). In this thesis we focus on a particular case of entailment, entailment by generality. Since there are various types of implication, we introduce the paradigm of Textual Entailment by Generality, which can be defined as the entailment from a specific sentence towards a more general sentence; in this context, the Text T entails the Hypothesis H because H is more general than T. We propose unsupervised, language-independent methods for Recognizing Textual Entailment by Generality. To this end we present an Informative Asymmetric Measure called the Simplified Asymmetric InfoSimba, which we combine with different asymmetric association measures to recognize this specific case of Textual Entailment. This thesis thus introduces a new kind of implication, implication by generality, and consequently the new task of recognizing implications by generality, a new direction of research in Natural Language Processing.

    Asymmetric Attributional Word Similarity Measures to Detect the Relations of Textual Generality

    No full text
    In this work, we present a new unsupervised and language-independent methodology to detect relations of textual generality. For this, we introduce a particular case of Textual Entailment (TE), namely Textual Entailment by Generality (TEG). TE aims to capture primary semantic inference needs across applications in Natural Language Processing (NLP). Since 2005, in the TE Recognition (RTE) task, systems have been asked to judge automatically whether the meaning of a portion of text, the Text (T), entails the meaning of another text, the Hypothesis (H). Several novel approaches and improvements in TE technologies demonstrated in the RTE Challenges signal renewed interest in a deeper understanding of the core phenomena involved in TE. In line with this direction, we focus here on a particular case of entailment, entailment by generality, in order to detect relations of textual generality. In text there are different kinds of entailment, yielded by different types of implicative reasoning (lexical, syntactic, common-sense based), but here we focus only on TEG, which can be defined as an entailment from a specific statement towards a relatively more general one. Therefore, we have T →G H whenever the premise T entails the hypothesis H and H is also more general than T. We propose an unsupervised and language-independent method to recognize TEGs from a pair ⟨T,H⟩ that stands in an entailment relation. To this end, we introduce an Informative Asymmetric Measure (IAM) called the Simplified Asymmetric InfoSimba (AISs), which we combine with different Asymmetric Association Measures (AAMs). In this work, we hypothesize the existence of a particular mode of TE, namely TEG, and the main contribution of our study is highlighting the importance of this inference mechanism. Consequently, the new annotation data seem to be a valuable resource for the community.
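    The directionality behind such measures can be sketched in a few lines. The abstract does not spell out the AISs formula, so the sketch below uses frequency-weighted token coverage purely as a stand-in for attributional similarity: a general hypothesis tends to be fully "covered" by a more specific text, while the reverse coverage is partial, and it is this asymmetry that a TEG detector exploits.

        # A minimal sketch of the asymmetry idea, not the actual AISs measure:
        # a directional score of how well H's content is covered by T, with
        # rarer shared tokens counting more (an assumed weighting).
        from collections import Counter

        def directional_score(src_tokens, tgt_tokens, freq):
            """Coverage of src by tgt, weighted by inverse token frequency."""
            weight = lambda w: 1.0 / (1 + freq[w])
            shared = set(src_tokens) & set(tgt_tokens)
            total = sum(weight(w) for w in set(src_tokens)) or 1.0
            return sum(weight(w) for w in shared) / total

        freq = Counter({"the": 1000, "black": 60, "dog": 30,
                        "barked": 10, "loudly": 15})
        T = "the black dog barked loudly".split()  # specific statement
        H = "the dog barked".split()               # more general statement
        print(directional_score(T, H, freq))  # < 1: T carries extra specific content
        print(directional_score(H, T, freq))  # = 1: H fully covered, hinting T →G H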

    Hydrogen-atom abstractions: a semi-empirical approach to reaction energetics, bond lengths and bond-orders

    Get PDF
    We propose the use of the Intersecting-State Model (ISM) to estimate activation barriers and reactive bond distances for reactions involving the transfer of hydrogen atoms. The method is used in a variety of systems with transition states of the (H)C–H–C(H), N–H–C(H), O–H–C(H), S–H–C(H), Si–H–C, Si–H–Si, Sn–H–C and Ge–H–C types. Hydrogen abstractions by halogen atoms are also investigated. Results are compared with available experimental, semi-empirical or ab initio data. Other transition state types (such as O–H–O) which cannot be properly rationalized in the light of an elementary bond-breaking/bond-forming process are also analyzed. Funding: Junta Nacional de Investigação Científica; PRAXIS/2/2.1/QUI/390/94.